An Information-Theoretic Analysis of Thompson Sampling
نویسندگان
چکیده
We provide an information-theoretic analysis of Thompson sampling that applies across a broad range of online optimization problems in which a decision-maker must learn from partial feedback. This analysis inherits the simplicity and elegance of information theory and leads to regret bounds that scale with the entropy of the optimal-action distribution. This strengthens preexisting results and yields new insight into how information improves performance.
منابع مشابه
Horvitz-Thompson estimator of population mean under inverse sampling designs
Inverse sampling design is generally considered to be appropriate technique when the population is divided into two subpopulations, one of which contains only few units. In this paper, we derive the Horvitz-Thompson estimator for the population mean under inverse sampling designs, where subpopulation sizes are known. We then introduce an alternative unbiased estimator, corresponding to post-st...
متن کاملAn Entropy Search Portfolio for Bayesian Optimization
Portfolio methods provide an effective, principled way of combining a collection of acquisition functions in the context of Bayesian optimization. We introduce a novel approach to this problem motivated by an information theoretic consideration. Our construction additionally provides an extension of Thompson sampling to continuous domains with GP priors. We show that our method outperforms a ra...
متن کاملLearning to Optimize via Information-Directed Sampling
This paper proposes information directed sampling–a new algorithm for balancing between exploration and exploitation in online optimization problems in which a decision-maker must learn from partial feedback. The algorithm quantifies the amount learned by selecting an action through an information theoretic measure: the mutual information between the true optimal action and the algorithm’s next...
متن کاملCombination of real options and game-theoretic approach in investment analysis
Investments in technology create a large amount of capital investments by major companies. Assessing such investment projects is identified as critical to the efficient assignment of resources. Viewing investment projects as real options, this paper expands a method for assessing technology investment decisions in the linkage existence of uncertainty and competition. It combines the game-theore...
متن کاملAn Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have been shown their performance in speech recognition systems for extracting features, and also acoustic modeling. In addition, CNNs have been used for robust speech recognition and competitive results have been reported. Convolutive Bottleneck Network (CBN) is a kind of CNNs which has a bottleneck layer among its fully connected layers. The bottleneck fea...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Journal of Machine Learning Research
دوره 17 شماره
صفحات -
تاریخ انتشار 2016